.PU
.TH bsc 1
.SH NAME
bsc \- the bitslice compiler, v0.1

.SH SYNOPSIS
.ll +8
.B bsc
.RB [ " \-v " ]
.RB [ " \-h " ]
.RB [ " \-p funname " ]
.RB [ " \-o outfile " ]
.RB [ " file " ]

.SH DESCRIPTION
.I Bsc
produces a C source file implementing a given algorithm using the
technique of orthogonal coding, also known as bitslicing. The algorithm
is given as an ascii file, in the syntax described below. The C source
consists of one, possibly huge, C function.

The C file produced may be either compiled spearately, or included
into another C file. It uses an unsigned integer type named bsc_u
which is defined in the provided file
.I bsc.h
and is supposed to be the largest unsigned integer managed natively
by the processor (usually an unsigned long on 32-bit and more
architectures).

.SH OPTIONS
.TP
.B \-v
Prints version information on stderr, and exits.
.TP
.B \-h
Prints usage on stderr, and exits (with an error code of 1).
.TP
.B \-p funname
Sets the name of the function generated; default is
.I bsc_code.
The default prototype for the function is
.B void bsc_code(bsc_u input[], bsc_u output[]);
it is not defined in
.I bsc.h.
.TP
.B \-o outfile
Sets the name of the ouput C file; default is stdout. "-" is
considered as an alias for stdout in this case.
.TP
.B file
The file where the bsc code describing the algorithm is; default is
stdin. "-" is considered as an alias for stdin in this case.

.SH SYNTAX
Ths syntax of bsc code is rather straightforward. It consists of
declarations and expressions. All declarations and expressions
are terminated by semicolons. Whitespace may be included everywhere,
and comments use the C style:
.B /* this is a comment */
Identifiers are strings of letters and numbers, including the
underscore ('_') character; they should not begin with a number.
There is no limit on the size of an identifier, and case is relevant.

All numeric constants may be given in decimal, octal or hexadecimal,
using usual conventions (if it begins with 0x, then it is hexadecimal,
else if it begins with 0 it is octal).

There is one data type: it is called
.B bit
and is a set of binary values. A variable of bit is declared
this way:
.br
.B bit x[50];
.br
This defines a 50-bit virtual register named
.B x.
Internally, 50 machine registers will be allocated to handle its
value (in fact, as many instances of
.B x
as there are bits in machine registers).

Two special registers may be defined:
.B input
and
.B output,
that correspond to the input and output of the function.

Expressions may be the following:
.TP
.B (expression)
Each expression may be included in parenthesis.
.TP
.B x = y
This expression copies the value of
.B y
into x, and has the value of
.B y.
You may use it as in C:
.B z = (x = y);
Each expression may be a
.B lvalue
(that is, something that may be placed at the left size of an equality),
but operations may create temporary registers to store the result,
which defeats the purpose (although the code would still be legal).
Concatenation of registers is guaranteed not to spawn copies.
.TP
.B x | y, x & y, x ^ y
The three boolean bit by bit operations OR, AND and XOR.
.TP
.B ~x
The bit by bit boolean negation of
.B x.
.TP
.B [ expr1 # expr2 ]
The concatenation of two expressions; its value is a virtual register
as wide as the sum of the sizes of
.B expr1
and
.B expr2.
The concatenation of more than two expressions may be performed:
.B [ x # y # z ]
.TP
.B x |= y, x &= y, x ^= y
As in C, a boolean operation and an affectation may be combined;
.B x ^= y
is strictly equivalent to
.B x = x ^ y

Two types of functions may be defined: extraction and lookup tables:
.TP
.B ext f[3](4) { 3, 0, 1 };
This defines the function
.B f
that takes as input an expression of size 4, and produces an output
of size 3, consisting of the bits 3, 0 and 1 of the input. The bits
are numbered from 0 on the left (the most significant bit).
The function would be used as in C:
.B x = f([y # z] ^ u);
Although the point in orthogonal coding is to resolve such functions
at compile time,
.B ext
functions have in bsc a semantic of copy. Useless copies are
identified and eliminated by bsc and the C compiler.
.TP
.B tab t[3](2) { 7, 1, 4, 5 };
This defines a lookup table, which outputs 3-bit values, depending on
the 2-bit input. In this example, this means 4 possible inputs, and
therefore 4 possible 3-bit ouputs. Bsc implements such tables as
multiplexer trees.

.SH TODO
Generic functions may be implemented using your favorite preprocessor;
however, a LOOP primitive might be useful (as replicating the code
results in a longer compilation time, and cache memory problems).

There is no primitive for additions and multiplications; these should
be included in a future version.

Bsc should identify dead operations, that is operations which have
no influence whatsoever on the output. This would help reduce the
size of the code.

Bsc should reuse its local variables in its output; this would reduce
the C compiler work.

.SH CAVEATS
The C file produced tends to be huge; this increases dramatically
compilation time, and reduces efficiency, as optimal performance may
not be achieved if the code is larger than the internal cache memory.
A full unrolled DES will result in a 2.8 megabytes C file; it takes
35 minutes to compile on a Sun Ultra1/170 with Solaris 2.5, the
Sun workshop compiler 4.2 (with -xO2 optimization), and plenty of
ram (the C compiler will eat up to 110 megabytes of ram). Gnu CC
2.8.1 on a Linux Alpha station crashed on the same file, with -O2
optimization.

The
.I bsc.h
file MUST be checked: for optimal performance, the bsc_u type must
have the same size than processor registers. Moreover, the macros
BSC_U_SIZE and LOG_BSC_U_SIZE must match this size, or else the
functions given in orth.h will not work properly.

.SH AUTHOR
Thomas Pornin, thomas.pornin@ens.fr

Feel free to send any comment or bug report. The last version should
be found on:
.br
.B ftp://ftp.ens.fr/pub/dmi/users/pornin/bsc-current.tar.gz

Eli Biham and Jean Vuillemin provided useful advice and hints.
